Fifth - virtual machine
1. Overview
Fifth is a hobby operating system built around a custom stack-based virtual CPU. The virtual CPU emulator runs as a COM executable (emulator.com) under MS-DOS.
The virtual machine is designed to be simple yet powerful, featuring two stacks (data stack and return stack), approximately 50 instructions (opcodes), and a theoretical 4 GB flat address space. It serves as the foundation for the Fifth language, a Forth-inspired, interactive programming environment where users can define, extend, and modify system behavior at runtime.
2. x86 Unreal Mode Execution
The emulator runs as a standard DOS COM executable and uses x86 real-mode assembly with a technique called "unreal mode" to access extended memory above the 1 MB real-mode limit, far beyond the 640 KB of conventional DOS memory. In unreal mode, the CPU operates in real mode (compatible with DOS), but the segment registers are configured to allow access to the full 32-bit address space (up to 4 GB).
Key aspects:
- Starts in real mode for DOS compatibility.
- Switches temporarily to x86 protected mode to initialize the segment registers for a flat 4 GB address space.
- Returns to real mode while retaining access to the full 4 GB address space.
- Uses XMS (Extended Memory Specification) for large memory allocation.
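
To make the difference concrete, the sketch below models the address arithmetic only; the emulator itself does this in x86 assembly. It assumes the usual real-mode rule (linear address = segment * 16 + offset) and treats unreal mode as the same rule with the offset limit raised to 32 bits. The function and constant names are hypothetical, chosen for illustration.

```python
# Illustrative arithmetic only; names here are hypothetical, not taken from the emulator.

REAL_MODE_OFFSET_LIMIT = 0xFFFF          # 16-bit offsets under the default 64 KB segment limit
UNREAL_MODE_OFFSET_LIMIT = 0xFFFFFFFF    # 32-bit offsets once the segment limit is raised to 4 GB


def linear_address(segment: int, offset: int, offset_limit: int) -> int:
    """Return the linear address for segment:offset, enforcing the offset limit."""
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment must fit in 16 bits")
    if not 0 <= offset <= offset_limit:
        raise ValueError("offset exceeds the segment limit")
    return (segment << 4) + offset


# Highest address reachable in plain real mode: FFFF:FFFF, just past 1 MB.
real_mode_top = linear_address(0xFFFF, 0xFFFF, REAL_MODE_OFFSET_LIMIT)

# With segment 0 and a 4 GB limit, the 32-bit offset alone is the linear address.
unreal_top = linear_address(0x0000, 0xFFFFFFFF, UNREAL_MODE_OFFSET_LIMIT)
```

Here `real_mode_top` evaluates to `0x10FFEF`, while `unreal_top` reaches `0xFFFFFFFF`, which is why a single flat offset is enough to address the whole virtual RAM.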
2.1. Memory Management and XMS
The emulator uses DOS XMS (Extended Memory Specification) to allocate large blocks of extended memory:
- XMS Allocation: Allocates the largest available block of extended memory through the XMS driver (HIMEM.SYS).
- Memory Locking: Locks the allocated block so that the XMS driver cannot move it. Under the XMS specification, locking is also the call that returns the block's 32-bit linear address.
- Address Mapping: Stores the linear address of the allocated block in the xms_addr variable.
- Virtual-to-Physical: Every virtual CPU address is offset by xms_addr to obtain the physical memory address.
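
The virtual-to-physical step is a single addition. A minimal sketch, assuming a made-up value for `xms_addr` (the real value is whatever the XMS driver returns at run time) and a hypothetical helper name:

```python
def to_physical(xms_addr: int, virtual_addr: int) -> int:
    """Translate a virtual CPU address to a physical one by adding the XMS base."""
    if virtual_addr < 0:
        raise ValueError("virtual addresses are unsigned")
    return xms_addr + virtual_addr


# Hypothetical base address, for illustration only.
example_xms_addr = 0x00110000

kernel_start = to_physical(example_xms_addr, 0x0000)   # virtual address 0
second_kb = to_physical(example_xms_addr, 0x0400)      # virtual address 1024
```

With this base, virtual address 0 maps to `0x00110000` and virtual address `0x400` maps to `0x00110400`. Bytecode running inside the VM never sees `xms_addr`; it only works with the virtual addresses.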
3. Two-Stack Architecture
The virtual CPU uses two separate stacks:
- Data Stack: Main working stack for operations.
- Return Stack (variable resp): Stores return addresses for subroutine calls.
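
The following toy model shows how the two stacks cooperate during a subroutine call. It is a sketch under assumptions, not the emulator's data layout: the class, the `ip` attribute, and the method signatures are hypothetical, while the behavior of `call`, `ret`, `push`, and `pop` follows the opcode descriptions in the instruction table.

```python
class TwoStackMachine:
    """Toy model of the data and return stacks (illustrative only)."""

    def __init__(self):
        self.data_stack = []     # operands for arithmetic, memory access, etc.
        self.return_stack = []   # return addresses and temporarily parked values
        self.ip = 0              # instruction pointer (hypothetical name)

    def call(self, target, return_address):
        # Save where to resume, then transfer control.
        self.return_stack.append(return_address)
        self.ip = target

    def ret(self):
        # Resume at the most recently saved return address.
        self.ip = self.return_stack.pop()

    def push(self):
        # Move the top of the data stack to the return stack.
        self.return_stack.append(self.data_stack.pop())

    def pop(self):
        # Move the top of the return stack back to the data stack.
        self.data_stack.append(self.return_stack.pop())


vm = TwoStackMachine()
vm.data_stack.append(7)
vm.call(target=100, return_address=9)
vm.push()   # park 7 on the return stack, above the return address
vm.pop()    # bring it back before returning
vm.ret()    # ip is 9 again; the data stack still holds 7
```

Keeping return addresses off the data stack lets a subroutine consume and produce operands freely. The one rule, visible above, is that anything parked with `push` must be removed with `pop` before `ret`, or `ret` would jump to the parked value.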
4. Graphics System
The virtual machine integrates directly with the VESA BIOS for graphics output. It uses the VESA 640x480 mode with 8-bit color (256 colors).
Graphics operations are performed through dedicated VM instructions.
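
For a sense of scale, one 640x480 frame at one byte per pixel occupies 307,200 bytes. The sketch below computes pixel offsets within such a frame, assuming a packed row-major layout with no padding between rows; that layout and the helper names are assumptions for illustration, not taken from the emulator source.

```python
WIDTH, HEIGHT = 640, 480
FRAME_SIZE = WIDTH * HEIGHT  # one byte per pixel in 8-bit color


def pixel_offset(x: int, y: int) -> int:
    """Byte offset of pixel (x, y), assuming packed rows of WIDTH bytes."""
    if not (0 <= x < WIDTH and 0 <= y < HEIGHT):
        raise ValueError("pixel outside the 640x480 frame")
    return y * WIDTH + x


frame = bytearray(FRAME_SIZE)      # off-screen image buffer
frame[pixel_offset(10, 2)] = 15    # set one pixel to palette index 15
```

A buffer prepared this way is the kind of memory block that an instruction such as vidmap would copy to video memory.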
5. Boot Process
- DOS loads emulator.com as a COM executable.
- system_init() allocates XMS memory and sets up the A20 gate.
- The emulator sets the VESA graphics mode (640x480, 8-bit color).
- The emulator loads the first kibibyte of the kernel (core.raw) from disk into the beginning of virtual RAM.
- The emulator jumps to the kernel entry point.
- The kernel loads the remaining 3 kibibytes of kernel bytecode into RAM. The initial bytecode part of the kernel must therefore not exceed 4 kibibytes.
- The kernel loads the high-level Fifth boot code into RAM.
- The kernel compiles additional modules into itself from high-level bytecode to gain filesystem support, dynamic RAM support, and other essential functions.
- Additional modules (keyboard and mouse drivers) and the interactive shell are loaded from filesystem files.
- The system enters the interactive REPL.
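
The two-stage kernel load can be sketched as follows. This is a simulation under assumptions: it treats the disk as a flat image of 1 KB sectors (matching the disk@ instruction), and it assumes the kernel occupies sectors 0 to 3 and lands contiguously at virtual address 0. The sector numbers, the contiguous layout, and all function names are hypothetical.

```python
SECTOR_SIZE = 1024     # disk@ transfers 1 KB at a time
KERNEL_SECTORS = 4     # 1 KB loaded by the emulator + 3 KB loaded by the kernel


def disk_read(disk: bytes, sector: int, ram: bytearray, addr: int) -> None:
    """Copy one 1 KB sector from the disk image into RAM at addr."""
    start = sector * SECTOR_SIZE
    chunk = disk[start:start + SECTOR_SIZE]
    if len(chunk) != SECTOR_SIZE:
        raise ValueError("sector lies outside the disk image")
    ram[addr:addr + SECTOR_SIZE] = chunk


def boot(disk: bytes, ram: bytearray) -> None:
    # Stage 1 (emulator): first kilobyte of the kernel goes to the start of RAM.
    disk_read(disk, 0, ram, 0)
    # Stage 2 (kernel): the loaded code pulls in the remaining three kilobytes.
    for sector in range(1, KERNEL_SECTORS):
        disk_read(disk, sector, ram, sector * SECTOR_SIZE)


disk_image = bytes(range(256)) * 16 + b"\xff" * SECTOR_SIZE   # 4 KB kernel + 1 KB of other data
ram = bytearray(8 * SECTOR_SIZE)
boot(disk_image, ram)
```

After `boot` returns, the first 4 KB of `ram` match the first 4 KB of the disk image and the rest of RAM is untouched. The split explains the size limit: the first kilobyte has to contain enough code to fetch the other three.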
6. Implemented instructions
Virtual CPU opcodes (most of them are also available as directly executable commands in the Fifth programming language):
| # | Name | Stack effect | Description | Notes |
|---|---|---|---|---|
| 0 | nop | -- | No operation; execution continues to next instruction | Used for padding or placeholder |
| 1 | halt | -- | Stop execution and exit emulator | Cleanly terminates the Fifth environment |
| 2 | kbd@ | -- scancode | Read keyboard scancode; returns 0 if no key pending | Make codes for press, break codes (value+128) for release |
| 3 | num | -- n | Push 32-bit literal from instruction stream | Followed by 4 bytes (little-endian) |
| 4 | jmp | -- | Unconditional jump to address from instruction stream | Followed by 4-byte target address |
| 5 | call | -- | Call subroutine; push return address on return stack | Use ret (opcode 11) to return; follow with 4-byte target address |
| 6 | 1+ | n -- n+1 | Increment top of data stack by 1 | Efficient single-byte operation for counters and pointers |
| 7 | 1- | n -- n-1 | Decrement top of data stack by 1 | Efficient single-byte operation for countdown loops |
| 8 | dup | n -- n n | Duplicate top of data stack | Preserve values before consuming operations |
| 9 | drop | n -- | Remove and discard top of data stack | Clean up stack after operations |
| 10 | if | n -- | Jump to address if top of stack is zero | |
| 11 | ret | -- | Return from subroutine (pop return address) | |
| 12 | c@ | addr -- byte | Read byte from memory at specified address | |
| 13 | c! | byte addr -- | Store byte to specified memory address | |
| 14 | push | n -- | Move top of data stack to return stack | |
| 15 | pop | -- n | Move top of return stack to data stack | |
| 16 | <unused> | | | |
| 17 | rot | n1 n2 n3 -- n2 n3 n1 | Rotate top three stack elements | |
| 18 | disk@ | sector addr -- | Read 1KB from disk into RAM at specified address | |
| 19 | disk! | addr sector -- | Write 1KB from RAM to disk at specified sector | |
| 20 | @ | addr -- n | Read 32-bit value from memory | |
| 21 | ! | n addr -- | Store 32-bit value to memory | |
| 22 | over | n1 n2 -- n1 n2 n1 | Duplicate second item on data stack | |
| 23 | swap | n1 n2 -- n2 n1 | Swap top two items on data stack | |
| 24 | + | n1 n2 -- n1+n2 | Add top two items on data stack | |
| 25 | - | n1 n2 -- n1-n2 | Subtract second item from top item on data stack | TODO: verify argument order |
| 26 | * | n1 n2 -- n1*n2 | Multiply top two items on data stack | |
| 27 | / | n1 n2 -- n2/n1 | Divide top item by second item on data stack | TODO: verify argument order |
| 28 | > | n1 n2 -- result | Push true if n1 > n2, false otherwise | TODO: document which values represent true and false |
| 29 | < | n1 n2 -- result | Push true if n1 < n2, false otherwise | TODO: document which values represent true and false |
| 30 | not | n -- ~n | Bitwise NOT on top of data stack | |
| 31 | i | -- n | Push top of return stack to data stack | |
| 32 | cprt@ | port -- byte | Read byte from hardware I/O port | |
| 33 | cprt! | byte port -- | Write byte to hardware I/O port | |
| 34 | i2 | -- n | Push second item from return stack to data stack | |
| 35 | i3 | -- n | Push third item from return stack to data stack | |
| 36 | shl | value count -- result | Left shift value by count bits; shifted-out bits discarded | |
| 37 | shr | value count -- result | Logical right shift; zeros shifted in, shifted-out bits discarded | |
| 38 | or | n1 n2 -- result | Bitwise OR of top two items on data stack | |
| 39 | xor | n1 n2 -- result | Bitwise XOR of top two items on data stack | |
| 40 | vidmap | addr -- | Copy memory to video memory | |
| 41 | mouse@ | -- x y buttons | Read mouse coordinates and button states | |
| 42 | vidput | addr1 addr2 x y -- | Blit image1 to image2 at (x,y) without transparency | |
| 43 | cmove | addr1 addr2 len -- | Copy memory block from addr1 to addr2 | |
| 44 | cfill | byte addr len -- | Fill memory with specified byte value | |
| 45 | tvidput | addr1 addr2 x y -- | Blit image1 to image2 at (x,y) with transparency support | |
| 46 | depth | -- depth | Push current data stack depth | |
| 47 | charput | fg bg src dest x y -- | Draw character from source buffer to destination buffer | |
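
To show how the opcode numbers and the inline 4-byte literal fit together, here is a minimal interpreter sketch for a handful of the instructions above (nop, halt, num, 1+, dup, +, *). It is written in Python for readability and is not the emulator's implementation, which is x86 assembly. The opcode numbers and the little-endian literal come from the table; the 32-bit wrap-around on overflow and all Python names are assumptions.

```python
import struct

# Opcode numbers taken from the table above.
NOP, HALT, NUM, INC, DUP, ADD, MUL = 0, 1, 3, 6, 8, 24, 26
MASK32 = 0xFFFFFFFF  # assumption: results wrap around at 32 bits


def run(bytecode: bytes) -> list:
    """Execute bytecode until halt and return the final data stack."""
    stack = []
    ip = 0
    while True:
        opcode = bytecode[ip]
        ip += 1
        if opcode == NOP:
            continue
        elif opcode == HALT:
            return stack
        elif opcode == NUM:
            # The literal is stored inline as 4 little-endian bytes.
            (value,) = struct.unpack_from("<I", bytecode, ip)
            ip += 4
            stack.append(value)
        elif opcode == INC:
            stack.append((stack.pop() + 1) & MASK32)
        elif opcode == DUP:
            stack.append(stack[-1])
        elif opcode == ADD:
            n2, n1 = stack.pop(), stack.pop()
            stack.append((n1 + n2) & MASK32)
        elif opcode == MUL:
            n2, n1 = stack.pop(), stack.pop()
            stack.append((n1 * n2) & MASK32)
        else:
            raise ValueError(f"opcode {opcode} is not covered by this sketch")


# Equivalent of "5 dup * 1+": square 5, then add 1.
program = bytes([NUM]) + struct.pack("<I", 5) + bytes([DUP, MUL, INC, HALT])
result = run(program)
```

Running the program leaves `[26]` on the data stack. The whole program is 9 bytes: one opcode byte plus four literal bytes for num, then one byte each for dup, *, 1+, and halt.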